Unlocking the Potential of Simulators: Design with RL in Mind
Authors
Abstract
Using Reinforcement Learning (RL) in simulation to construct policies that are useful in real life is challenging. This is often attributed to the sequential decision-making aspect: inaccuracies in simulation accumulate over multiple steps, so the simulated trajectories diverge from what would happen in reality. In our work we show the need to consider another important aspect: the mismatch in simulating control. We bring attention to the need for modeling control as well as dynamics, since oversimplified assumptions about how the actions of RL policies are applied could make the policies fail on real-world systems. We design a simulator for solving a pivoting task (of interest in Robotics) and demonstrate that even a simple simulator designed with RL in mind outperforms high-fidelity simulators when it comes to learning a policy that is to be deployed on a real robotic system. We show that friction, a phenomenon that is hard to model, could be exploited successfully even when RL is performed using a simulator with simple dynamics and noise models. Hence, we demonstrate that as long as the main sources of uncertainty are identified, it could be possible to learn policies applicable to real systems even with a simple simulator. RL-compatible simulators could open up possibilities for applying a wide range of RL algorithms in various fields. This is important because data sparsity in fields like healthcare and education currently forces researchers and engineers to consider only sample-efficient RL approaches. Successful simulator-aided RL could increase the flexibility of experimenting with RL algorithms and help apply RL policies to real-world settings in fields where data is scarce. We believe that lessons learned in Robotics could help other fields design RL-compatible simulators, so we summarize our experience and conclude with suggestions.
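To make the idea of modeling control as well as dynamics concrete, the following is a minimal sketch and not the simulator described in the abstract. It assumes a hypothetical one-dimensional point mass with a noisy friction coefficient, and it models control mismatch as a fixed actuation delay. The class name `ToySimulator` and all parameters (`dt`, `friction_mean`, `friction_std`, `delay_steps`) are illustrative assumptions.

```python
import random
from collections import deque


class ToySimulator:
    """Hypothetical 1-D point mass with noisy friction and delayed actuation."""

    def __init__(self, dt=0.01, friction_mean=0.5, friction_std=0.1,
                 delay_steps=2, seed=0):
        self.dt = dt
        self.friction_mean = friction_mean
        self.friction_std = friction_std
        self.rng = random.Random(seed)
        # Control model (assumption): a command only takes effect
        # delay_steps simulation steps after the policy issues it.
        self.pending = deque([0.0] * delay_steps)
        self.pos = 0.0
        self.vel = 0.0

    def step(self, action):
        # Queue the new command and apply the oldest pending one.
        self.pending.append(action)
        applied = self.pending.popleft()
        # Dynamics noise (assumption): friction is resampled at every step
        # from a Gaussian and clipped at zero.
        friction = max(0.0, self.rng.gauss(self.friction_mean,
                                           self.friction_std))
        acc = applied - friction * self.vel
        self.vel += acc * self.dt
        self.pos += self.vel * self.dt
        return self.pos, self.vel


if __name__ == "__main__":
    sim = ToySimulator(delay_steps=2)
    for t in range(4):
        print(t, sim.step(1.0))
```

With `delay_steps=2` the state stays at rest for the first two steps even though the policy already commands a nonzero action. A policy trained with `delay_steps=0` would never experience that lag, which is one simple way a control mismatch between simulation and a real system could arise.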
Journal: CoRR
Volume: abs/1706.02501
Publication date: 2017